Leveraging Bagging for Evolving Data Streams
Authors
Abstract
Bagging, boosting, and Random Forests are classical ensemble methods used to improve the performance of single classifiers. They obtain superior performance by increasing the accuracy and diversity of the single classifiers. Attempts have been made to reproduce these methods in the more challenging context of evolving data streams. In this paper, we propose a new variant of bagging, called leveraging bagging. This method combines the simplicity of bagging with additional randomization of the input and output of the classifiers. We test our method in an evaluation study on synthetic and real-world datasets comprising up to ten million examples.
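The input randomization mentioned in the abstract can be sketched as online bagging in which each ensemble member sees every incoming example k times, with k drawn from a Poisson distribution whose mean exceeds one (plain online bagging uses Poisson(1)). The sketch below is illustrative, not the paper's implementation: the class names are hypothetical, and the toy nearest-class-mean base learner stands in for the Hoeffding trees a real stream ensemble would use.

```python
import math
import random

class NearestMean:
    """Toy incremental base learner (illustrative only): classifies
    an example by the nearest per-class running mean."""
    def __init__(self):
        self.sums, self.counts = {}, {}

    def learn_one(self, x, y):
        s = self.sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict_one(self, x):
        best, best_d = None, None
        for y, s in self.sums.items():
            c = self.counts[y]
            d = sum((v - sv / c) ** 2 for v, sv in zip(x, s))
            if best_d is None or d < best_d:
                best, best_d = y, d
        return best

class LeveragedBaggingSketch:
    """Leveraging-bagging-style input randomization: each example is
    presented to each member k ~ Poisson(lambda_) times, lambda_ > 1."""
    def __init__(self, make_member, n_members=5, lambda_=6.0, seed=0):
        self.members = [make_member() for _ in range(n_members)]
        self.lambda_ = lambda_
        self.rng = random.Random(seed)

    def _poisson(self):
        # Knuth's method; adequate for small lambda.
        limit, k, p = math.exp(-self.lambda_), 0, 1.0
        while p > limit:
            k += 1
            p *= self.rng.random()
        return k - 1

    def learn_one(self, x, y):
        for m in self.members:
            for _ in range(self._poisson()):
                m.learn_one(x, y)

    def predict_one(self, x):
        votes = {}
        for m in self.members:
            y = m.predict_one(x)
            if y is not None:
                votes[y] = votes.get(y, 0) + 1
        return max(votes, key=votes.get) if votes else None
```

Raising the Poisson mean above one makes the members' effective training sets diverge more, which is the extra input randomization the abstract refers to.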
Similar Papers
Improving Adaptive Bagging Methods for Evolving Data Streams
We propose two new improvements for bagging methods on evolving data streams. Recently, two new variants of Bagging were proposed: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. ASHT Bagging uses trees of different sizes, and ADWIN Bagging uses ADWIN as a change detector to decide when to discard underperforming ensemble members. We improve ADWIN Bagging using Hoeffding Adaptive...
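The ADWIN change detector named above maintains a window of recent values (for example, a member's error indicators) and shrinks it whenever two adjacent sub-windows have significantly different means. The following is a heavily simplified, illustrative variant under assumed defaults: the real ADWIN uses exponential histograms for efficiency, while this sketch naively checks every cut point.

```python
import math
from collections import deque

class SimpleAdwin:
    """Simplified ADWIN-style drift detector (illustrative only).
    Keeps a window of recent values; when the means of two adjacent
    sub-windows differ by more than a Hoeffding-style bound, drops
    the older sub-window and reports a change."""

    def __init__(self, delta=0.002, max_len=1000):
        self.delta = delta
        self.window = deque(maxlen=max_len)

    def update(self, value):
        """Add a value; return True if a change was detected."""
        self.window.append(value)
        n = len(self.window)
        if n < 10:
            return False
        items = list(self.window)
        total = sum(items)
        left_sum = 0.0
        for i in range(1, n):
            left_sum += items[i - 1]
            n0, n1 = i, n - i
            mu0, mu1 = left_sum / n0, (total - left_sum) / n1
            # Harmonic mean of the two sub-window sizes.
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps = math.sqrt((1.0 / (2.0 * m)) * math.log(4.0 / self.delta))
            if abs(mu0 - mu1) > eps:
                # Change detected: keep only the newer sub-window.
                for _ in range(n0):
                    self.window.popleft()
                return True
        return False
```

In an ADWIN Bagging setup, a detector like this watches each member's error stream, and a member flagged as drifting is reset or replaced.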
Improving Incremental Recommenders with Online Bagging
Online recommender systems often deal with continuous, potentially fast and unbounded flows of data. Ensemble methods for recommender systems have been used in the past in batch algorithms; however, they have never been studied with incremental algorithms, which are capable of processing those data streams on the fly. We propose online bagging, using an incremental matrix factorization algorithm ...
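The incremental matrix factorization this abstract combines with online bagging can be sketched as per-observation SGD over user and item latent vectors. The class, method names, and hyperparameters below are assumptions for illustration, not the authors' implementation.

```python
import random

class IncrementalMF:
    """Sketch of an incremental matrix factorization recommender
    trained by stochastic gradient descent, one rating at a time."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0):
        self.k, self.lr, self.reg = n_factors, lr, reg
        self.rng = random.Random(seed)
        self.P, self.Q = {}, {}  # user and item latent vectors

    def _vec(self):
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]

    def predict_one(self, user, item):
        p = self.P.setdefault(user, self._vec())
        q = self.Q.setdefault(item, self._vec())
        return sum(a * b for a, b in zip(p, q))

    def learn_one(self, user, item, rating):
        p = self.P.setdefault(user, self._vec())
        q = self.Q.setdefault(item, self._vec())
        err = rating - sum(a * b for a, b in zip(p, q))
        # Regularized SGD step on both factor vectors.
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
```

Because the model learns from one observation at a time, it slots directly into a Poisson-weighted online bagging loop: each ensemble member simply calls `learn_one` k times per incoming rating.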
Case Study on Bagging Stable Classifiers for Data Streams
Ensembles of classifiers are among the strongest classifiers in most data mining applications. Bagging ensembles exploit the instability of base classifiers by training them on different bootstrap replicates. It has been shown that bagging unstable classifiers, such as decision trees, generally yields good results, whereas bagging stable classifiers, such as k-NN, makes little difference. Howeve...
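The bootstrap replicates mentioned above can be sketched in a few lines: each base model is fit on a same-size sample drawn with replacement from the training set, and predictions are combined by majority vote. The function names and the `fit`/`predict` interface are hypothetical, chosen only for this sketch.

```python
import random

def bootstrap_bagging(train, fit, n_members=10, seed=0):
    """Classical batch bagging sketch: fit each base model on a
    bootstrap replicate (sampled with replacement) of `train`.
    `fit` maps a list of (x, y) pairs to a model with .predict(x)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_members):
        replicate = [train[rng.randrange(len(train))] for _ in train]
        models.append(fit(replicate))
    return models

def majority_predict(models, x):
    """Combine the members' predictions by unweighted majority vote."""
    votes = {}
    for m in models:
        y = m.predict(x)
        votes[y] = votes.get(y, 0) + 1
    return max(votes, key=votes.get)
```

The replicates differ in which examples they repeat or omit, so an unstable learner (e.g. a decision tree) produces diverse members, while a stable learner like k-NN barely changes across replicates, which is the effect the case study examines.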
MOA: Massive Online Analysis
Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA includes a collection of offline and online methods as well as tools for evaluation. In particular, it implements boosting, bagging, and Hoeffding Trees, all with and without Naïve Bayes classifiers at the leaves. MOA supports bi-directi...